An LCS-based string metric

Author

  • Daniel Bakkelund
Abstract

These notes present a string similarity measure that is a metric in the mathematical sense. In particular, the triangle inequality holds for this metric. The metric is based on the longest common subsequence (LCS) measure, and the complexity of any sensible implementation will be no worse than O(n^2).
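
The abstract does not state the formula, so the following is only a minimal sketch of what an LCS-based distance can look like. It assumes the normalization d(a, b) = 1 - |LCS(a, b)| / max(|a|, |b|); whether this matches the definition used in the notes is an assumption, and the function names `lcs_length` and `lcs_distance` are hypothetical. The LCS length is computed with the standard dynamic program, whose running time is proportional to the product of the two string lengths.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic program that keeps only one previous row,
    so it uses O(len(b)) memory and O(len(a) * len(b)) time.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence ending before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop one character from either string.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_distance(a: str, b: str) -> float:
    """Assumed normalization: 1 - LCS(a, b) / max(len(a), len(b)).

    Two empty strings are treated as identical (distance 0.0).
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(a, b) / longest
```

Under this assumed definition, identical strings are at distance 0, strings with no character in common are at distance 1, and the value is symmetric in its two arguments.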

Similar resources

Sparse LCS Common Substring Alignment

The “Common Substring Alignment” problem is defined as follows. The input consists of a set of strings S1, S2 . . . , Sc , with a common substring appearing at least once in each of them, and a target string T . The goal is to compute similarity of all strings Si with T , without computing the part of the common substring over and over again. In this paper we consider the Common Substring Align...

Two Algorithms for LCS Consecutive Suffix Alignment

The problem of aligning two sequences A and B to determine their similarity is one of the fundamental problems in pattern matching. A challenging, basic variation of the sequence similarity problem is the incremental string comparison problem, denoted Consecutive Suffix Alignment, which is, given two strings A and B, to compute the alignment solution of each suffix of A versus B. Here, we prese...

Semi-local longest common subsequences in subquadratic time

For two strings a, b of lengths m, n respectively, the longest common subsequence (LCS) problem consists in comparing a and b by computing the length of their LCS. In this paper, we define a generalisation, called “the all semi-local LCS problem”, where each string is compared against all substrings of the other string, and all prefixes of each string are compared against all suffixes of the ot...

Semi-local String Comparison: Algorithmic Techniques and Applications

The longest common subsequence (LCS) problem is a classical problem in computer science. The semi-local LCS problem is a generalisation of the LCS problem, arising naturally in the context of string comparison. Apart from playing an important role in string algorithms, this problem turns out to have surprising connections with computational geometry, algebra, graph theory, as well as applicatio...

All Semi-local Longest Common Subsequences in Subquadratic Time

For two strings a, b of lengths m, n respectively, the longest common subsequence (LCS) problem consists in comparing a and b by computing the length of their LCS. In this paper, we define a generalisation, called “the all semi-local LCS problem”, where each string is compared against all substrings of the other string, and all prefixes of each string are compared against all suffixes of the ot...

Publication date: 2009